Results 1 - 20 of 24
1.
bioRxiv ; 2024 Apr 13.
Article in English | MEDLINE | ID: mdl-38645134

ABSTRACT

Missense variants can have a range of functional impacts depending on factors such as the specific amino acid substitution and location within the gene. To interpret their deleteriousness, studies have sought to identify regions within genes that are specifically intolerant of missense variation [1-12]. Here, we leverage the patterns of rare missense variation in 125,748 individuals in the Genome Aggregation Database (gnomAD) [13] against a null mutational model to identify transcripts that display regional differences in missense constraint. Missense-depleted regions are enriched for ClinVar [14] pathogenic variants, de novo missense variants from individuals with neurodevelopmental disorders (NDDs) [15,16], and complex trait heritability. Following ClinGen calibration recommendations for the ACMG/AMP guidelines, we establish that regions with less than 20% of their expected missense variation achieve moderate support for pathogenicity. We create a missense deleteriousness metric (MPC) that incorporates regional constraint and outperforms other deleteriousness scores at stratifying case and control de novo missense variation, with a strong enrichment in NDDs. These results provide additional tools to aid in missense variant interpretation.

3.
Nat Genet ; 56(1): 152-161, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057443

ABSTRACT

Recessive diseases arise when both copies of a gene are impacted by a damaging genetic variant. When a patient carries two potentially causal variants in a gene, accurate diagnosis requires determining that these variants occur on different copies of the chromosome (that is, are in trans) rather than on the same copy (that is, in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. Here we developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in the Genome Aggregation Database (v2, n = 125,748 exomes). Our approach estimates phase with 96% accuracy, both in trio data and in patients with Mendelian conditions and presumed causal compound heterozygous variants. We provide a public resource of phasing estimates for coding variants and counts per gene of rare variants in trans that can aid interpretation of rare co-occurring variants in the context of recessive disease.


Subject(s)
Exome , High-Throughput Nucleotide Sequencing , Humans , Exome/genetics , Exome Sequencing , Genotype
4.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subject(s)
Human Genome , Genomics , Genetic Models , Mutation , Humans , Access to Information , Genetic Databases , Datasets as Topic , Gene Frequency , Human Genome/genetics , Mutation/genetics , Genetic Selection
5.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-36993580

ABSTRACT

Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n = 125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency < 1 × 10⁻⁴). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease.

6.
Nat Genet ; 54(5): 541-547, 2022 05.
Article in English | MEDLINE | ID: mdl-35410376

ABSTRACT

We report results from the Bipolar Exome (BipEx) collaboration analysis of whole-exome sequencing of 13,933 patients with bipolar disorder (BD) matched with 14,422 controls. We find an excess of ultra-rare protein-truncating variants (PTVs) in patients with BD among genes under strong evolutionary constraint in both major BD subtypes. We find enrichment of ultra-rare PTVs within genes implicated from a recent schizophrenia exome meta-analysis (SCHEMA; 24,248 cases and 97,322 controls) and among binding targets of CHD8. Genes implicated from genome-wide association studies (GWASs) of BD, however, are not significantly enriched for ultra-rare PTVs. Combining gene-level results with SCHEMA, AKAP11 emerges as a definitive risk gene (odds ratio (OR) = 7.06, P = 2.83 × 10⁻⁹). At the protein level, AKAP-11 interacts with GSK3B, the hypothesized target of lithium, a primary treatment for BD. Our results lend support to BD's polygenicity, demonstrating a role for rare coding variation as a significant risk factor in BD etiology.


Subject(s)
Bipolar Disorder , Schizophrenia , A Kinase Anchor Proteins/genetics , Bipolar Disorder/genetics , Exome/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Schizophrenia/genetics , Exome Sequencing
7.
Hum Mutat ; 43(6): 698-707, 2022 06.
Article in English | MEDLINE | ID: mdl-35266241

ABSTRACT

Exome and genome sequencing have become the tools of choice for rare disease diagnosis, leading to large amounts of data available for analyses. To identify causal variants in these datasets, powerful filtering and decision support tools that can be efficiently used by clinicians and researchers are required. To address this need, we developed seqr - an open-source, web-based tool for family-based monogenic disease analysis that allows researchers to work collaboratively to search and annotate genomic callsets. To date, seqr is being used in several research pipelines and one clinical diagnostic lab. In our own experience through the Broad Institute Center for Mendelian Genomics, seqr has enabled analyses of over 10,000 families, supporting the diagnosis of more than 3,800 individuals with rare disease and discovery of over 300 novel disease genes. Here, we describe a framework for genomic analysis in rare disease that leverages seqr's capabilities for variant filtration, annotation, and causal variant identification, as well as support for research collaboration and data sharing. The seqr platform is available as open source software, allowing low-cost participation in rare disease research, and a community effort to support diagnosis and gene discovery in rare disease.


Subject(s)
Genomics , Rare Diseases , Exome , Humans , Internet , Rare Diseases/diagnosis , Rare Diseases/genetics , Software
8.
Cell Genom ; 2(9): 100168, 2022 Sep 14.
Article in English | MEDLINE | ID: mdl-36778668

ABSTRACT

Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.

9.
Hum Mutat ; 43(8): 1012-1030, 2022 08.
Article in English | MEDLINE | ID: mdl-34859531

ABSTRACT

Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease-gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/) that enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per-base expression levels, constraint scores, and variant co-occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease.


Subject(s)
Rare Diseases , Software , Genetic Databases , Gene Frequency , Humans , Rare Diseases/genetics
14.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subject(s)
Disease/genetics , Genetic Variation , Medical Genetics/standards , Population Genetics/standards , Human Genome/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Single Nucleotide Polymorphism/genetics , Racial Groups/genetics , Reference Standards , Genetic Selection , Whole Genome Sequencing
15.
Nature ; 581(7809): 452-458, 2020 05.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.


Subject(s)
Disease/genetics , Haploinsufficiency/genetics , Loss of Function Mutation/genetics , Molecular Sequence Annotation , Genetic Transcription , Transcriptome/genetics , Autism Spectrum Disorder/genetics , Datasets as Topic , Developmental Disabilities/genetics , Exons/genetics , Female , Genotype , Humans , Intellectual Disability/genetics , Male , Molecular Sequence Annotation/standards , Poisson Distribution , Messenger RNA/analysis , Messenger RNA/genetics , Rare Diseases/diagnosis , Rare Diseases/genetics , Reproducibility of Results , Exome Sequencing
16.
Nature ; 581(7809): 434-443, 2020 05.
Article in English | MEDLINE | ID: mdl-32461654

ABSTRACT

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.


Subject(s)
Exome/genetics , Essential Genes/genetics , Genetic Variation/genetics , Human Genome/genetics , Adult , Brain/metabolism , Cardiovascular Diseases/genetics , Cohort Studies , Genetic Databases , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Loss of Function Mutation/genetics , Male , Mutation Rate , Proprotein Convertase 9/genetics , Messenger RNA/genetics , Reproducibility of Results , Exome Sequencing , Whole Genome Sequencing
17.
Nucleic Acids Res ; 45(D1): D840-D845, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899611

ABSTRACT

Worldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60 706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available at http://exac.broadinstitute.org, and has already been used extensively by clinical laboratories worldwide.


Subject(s)
Computational Biology/methods , Genetic Databases , Exome , Genomics/methods , Web Browser , Genome-Wide Association Study/methods , Humans , Software , User-Computer Interface
18.
ACS Chem Biol ; 10(7): 1684-93, 2015 Jul 17.
Article in English | MEDLINE | ID: mdl-25856271

ABSTRACT

Within a superfamily, functionally diverged metalloenzymes often favor different metals as cofactors for catalysis. One hypothesis is that incorporation of alternative metals expands the catalytic repertoire of metalloenzymes and provides evolutionary springboards toward new catalytic functions. However, there is little experimental evidence that incorporation of alternative metals changes the activity profile of metalloenzymes. Here, we systematically investigate how metals alter the activity profiles of five functionally diverged enzymes of the metallo-β-lactamase (MBL) superfamily. Each enzyme was reconstituted in vitro with six different metals, Cd²⁺, Co²⁺, Fe²⁺, Mn²⁺, Ni²⁺, and Zn²⁺, and assayed against eight catalytically distinct hydrolytic reactions (representing native functions of MBL enzymes). We reveal that each enzyme metal isoform has a significantly different activity level for native and promiscuous reactions. Moreover, metal preferences for native versus promiscuous activities are not correlated and, in some cases, are mutually exclusive; only particular metal isoforms disclose cryptic promiscuous activities but often at the expense of the native activity. For example, the L1 B3 β-lactamase displays a 1000-fold catalytic preference for Zn²⁺ over Ni²⁺ for its native activity but exhibits promiscuous thioester, phosphodiester, phosphotriester, and lactonase activity only with Ni²⁺. Furthermore, we find that the five MBL enzymes exist as an ensemble of various metal isoforms in vivo, and this heterogeneity results in an expanded activity profile compared to a single metal isoform. Our study suggests that promiscuous activities of metalloenzymes can stem from an ensemble of metal isoforms in the cell, which could facilitate the functional divergence of metalloenzymes.


Subject(s)
Alteromonas/enzymology , Escherichia coli/enzymology , Metals/metabolism , Pseudomonas aeruginosa/enzymology , Salmonella/enzymology , beta-Lactamases/metabolism , Alteromonas/chemistry , Escherichia coli/chemistry , Hydrolysis , Metals/chemistry , Molecular Models , Protein Isoforms/chemistry , Protein Isoforms/metabolism , Pseudomonas aeruginosa/chemistry , Salmonella/chemistry , beta-Lactamases/chemistry
19.
Structure ; 23(3): 571-583, 2015 Mar 03.
Article in English | MEDLINE | ID: mdl-25684576

ABSTRACT

Mycobacterium tuberculosis (Mtb) uses the ESX-1 type VII secretion system to export virulence proteins across its lipid-rich cell wall, which helps permeabilize the host's macrophage phagosomal membrane, facilitating the escape and cell-to-cell spread of Mtb. ESX-1 membranolytic activity depends on a set of specialized secreted Esp proteins, the structure and specific roles of which are not currently understood. Here, we report the X-ray and electron microscopic structures of the ESX-1-secreted EspB. We demonstrate that EspB adopts a PE/PPE-like fold that mediates oligomerization with apparent heptameric symmetry, generating a barrel-shaped structure with a central pore that we propose contributes to the macrophage killing functions of EspB. Our structural data also reveal unexpected direct interactions between the EspB bipartite secretion signal sequence elements that form a unified aromatic surface. These findings provide insight into how specialized proteins encoded within the ESX-1 locus are targeted for secretion, and for the first time indicate an oligomerization-dependent role for Esp virulence factors.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Secretion Systems/chemistry , Mycobacterium smegmatis/chemistry , Mycobacterium tuberculosis/chemistry , Amino Acid Sequence , Bacterial Proteins/physiology , Bacterial Secretion Systems/physiology , Biological Transport , X-Ray Crystallography , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Molecular Models , Molecular Sequence Data , Quaternary Protein Structure , Secondary Protein Structure
20.
Proc Natl Acad Sci U S A ; 112(6): E576-85, 2015 Feb 10.
Article in English | MEDLINE | ID: mdl-25624472

ABSTRACT

Unique to Gram-positive bacteria, wall teichoic acids are anionic glycopolymers cross-stitched to a thick layer of peptidoglycan. The polyol phosphate subunits of these glycopolymers are decorated with GlcNAc sugars that are involved in phage binding, genetic exchange, host antibody response, resistance, and virulence. The search for the enzymes responsible for GlcNAcylation in Staphylococcus aureus has recently identified TarM and TarS with respective α- and β-(1-4) glycosyltransferase activities. The stereochemistry of the GlcNAc attachment is important in balancing biological processes, such that the interplay of TarM and TarS is likely important for bacterial pathogenicity and survival. Here we present the crystal structure of TarM in an unusual ternary-like complex consisting of a polymeric acceptor substrate analog, UDP from a hydrolyzed donor, and an α-glyceryl-GlcNAc product formed in situ. These structures support an internal nucleophilic substitution-like mechanism, lend new mechanistic insight into the glycosylation of glycopolymers, and reveal a trimerization domain with a likely role in acceptor substrate scaffolding.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Cell Wall/enzymology , Glycosyltransferases/metabolism , Molecular Models , Staphylococcus aureus/enzymology , Teichoic Acids/metabolism , Bacterial Proteins/genetics , Molecular Cloning , Crystallization , Enzyme Stability , Glycosyltransferases/chemistry , Glycosyltransferases/genetics , Mass Spectrometry , Metals/analysis , Biomolecular Nuclear Magnetic Resonance , Polymerization , Protein Conformation